INFERENCE FOR CENSORED QUANTILE REGRESSION
MODELS IN LONGITUDINAL STUDIES 

By Huixia Judy Wang and Mendel Fygenson

North Carolina State University and University of Southern California 

We develop inference procedures for longitudinal data where some 
of the measurements are censored by fixed constants. We consider a 
semi-parametric quantile regression model that makes no distribu- 
tional assumptions. Our research is motivated by the lack of proper 
inference procedures for data from biomedical studies where measure- 
ments are censored due to a fixed quantification limit. In such studies 
the focus is often on testing hypotheses about treatment equality. To 
this end, we propose a rank score test for large sample inference on a 
subset of the covariates. We demonstrate the importance of account- 
ing for both censoring and intra-subject dependency and evaluate 
the performance of our proposed methodology in a simulation study. 
We then apply the proposed inference procedures to data from an 
AIDS-related clinical trial. We conclude that our framework and
proposed methodology are very valuable for differentiating the influences
of predictors at different locations in the conditional distribution of
a response variable.

1. Introduction. Longitudinal studies, in which repeated measurements 
are made on the same subject, are common in many areas of research. How- 
ever, proper quantile inference procedures have not been established for 
longitudinal data in which some responses are left censored. This occurs, 
for example, when assessing the concentration of a pollutant in the environ- 
ment [27], the antibody concentration in blood serum [22] or the amount of 
viral RNA (i.e., viral load) in individuals infected with Human Immunode- 
ficiency Virus (HIV) [15]. In such cases, left censoring is typically due to the 
detection limit of the diagnostic assay. 

In this paper, we consider inferences in a quantile regression setup where 
some of the responses are censored by fixed values and where repeated
measurements may be taken at different points across subjects. We anchor our
investigation to an AIDS-related clinical trial since many approaches pro- 
posed for dealing with left censoring in longitudinal studies have been ap- 
plied to such data. 

Viral load is a measure of the amount of actively replicating virus and is 
used as a marker of disease progression among HIV-infected people. Viral 
load measurements are often subject to left censoring due to a lower limit 
of quantification. The detection limit depends upon the assay used, ranging 
from 500 copies/ml for the first assays available in the mid-nineties to 50 
copies/ml for today's ultrasensitive assay. Despite the improvement in as- 
say sensitivity, left censoring remains a critical issue because anti-retroviral 
treatments have become so effective as to lead to a steep decrease of HIV- 
RNA after their initiation. 

Studies that measure HIV-RNA commonly incorporate repeated measure- 
ments in order to (1) control for variation among individuals and (2) monitor 
temporal changes in viral load during treatment. Characterization of viral 
dynamics in patients with different treatment regimens is essential to further 
development of treatments and evaluation of their efficacy (e.g., [5]). 

In the medical statistical literature several methods have been proposed to 
handle the left censoring of HIV-RNA data. These include crude methods 
that use either the threshold value or some arbitrary point, such as the 
mid-point between zero and the cut off for the detection (e.g., [14]). These 
approaches usually lead to biased predictions that are systematically higher 
than predictions based on the true unknown values below the cut-off [8]. 

Other researchers considered mixed models and many applied a likelihood- 
based approach while assuming Gaussian distribution for both random ef- 
fects and random errors; see, for example, [15, 16, 21, 32]. Chu et al. [4] 
considered mixture models to study the correlation between a pair of vi- 
ral load measurements from each of a sample of patients assuming bivari- 
ate normal distributions. Compared to simple imputation, likelihood-based 
methods produce estimators that are less biased but with higher standard 
deviations. Even though the normality assumption eases mathematical com- 
plications, it may be unrealistic as viral load measurements are known to 
be highly skewed to the right, even after log transformation; see [7, 13]. 
Some nonlikelihood-based approaches include Sun and Wu [28], who
considered a regression model with semi-parametric time-varying coefficients,
and Hogan and Lee [13], who studied marginal structural quantile models
with time-varying treatments. The former paper ignored the left censoring
of the viral load measurements, and the latter replaced the censored values
with a randomly generated number between zero and the detection limit.

In general, discarding censored measurements or ignoring them as such 
leads to biased inferences. Treating longitudinal data as independent
observations can result in wrong nominal levels and/or power loss in testing
hypotheses. In what follows, we develop inference procedures within the semi-
parametric framework of quantile regression. In our analysis, we examine 
and account for both the effects of fixed censoring in the dependent variable 
and the longitudinal nature of the observations. Since semi-parametric quan- 
tile regression models impose minimal assumptions on the error term, our 
resulting inference procedures are robust to distributional misspecifications 
and most appropriate for applications with extremely skewed observations. 
When the censored response variable is non-Gaussian, a traditional regres- 
sion approach, which captures changes in the conditional mean, may not 
effectively detect changes in the conditional distribution. This can be criti- 
cal in applications where the upper/lower quantiles of the response variable 
may relate differently to the covariates, leading to differing assessments of a 
factor's importance or a treatment's efficacy. 

Powell [24, 25] pioneered inference procedures for quantile regression with 
fixed censoring. Bilias, Chen and Ying [2] proposed a re-sampling-based in- 
ference procedure by convexifying Powell's estimator in the resampling stage. 
Later, Zhao [35] discussed several median inferential methods. Ying, Jung 
and Wei [34] and Portnoy [23] provided quantile estimation procedures for 
random censoring. While other papers have been written on censored quan- 
tile regression, all existing inferential methods are developed for independent 
observations. 

In this paper, we develop large sample inference procedures for longitudi- 
nal data. Our focus is on testing hypotheses about treatment equality and 
covariate significance in quantile regression models. When proposing test 
statistics, one may either explore the asymptotic normality of the estimated 
coefficients or apply likelihood ratio-based tests. However, the former re- 
quires estimating the corresponding variance-covariance matrix, which is a 
challenge in our semi-parametric framework because the variance-covariance 
matrix is a function of the unspecified densities of error terms. The latter is, 
in general, difficult to develop for quantile regression, and even more so in 
our framework because the limiting distribution takes a complicated form 
involving the unknown error density function. We therefore extend the rank 
score test proposed in [9] to our setting and study its local power theoreti- 
cally and through simulations. A similar testing approach was successfully 
implemented in [30] in the context of conditional growth charts and in [29] 
for detecting differential expressions in GeneChip microarray data.

This paper is organized as follows: In the next section we introduce no- 
tation, review various models, provide the large sample properties of the 
corresponding estimators, present the rank score test, and discuss the
construction of confidence intervals. In Section 3 we report results from a
simulation study comparing our method with two naive methods and a
bootstrap method. In Section 4, we demonstrate our method through analysis of
HIV-RNA data from an AIDS clinical trial study. In Section 5, we discuss 
the merits of our methodology and outline future research topics. Technical
proofs of the theorems and other lemmas are relegated to the Appendix. 

2. Estimation and proposed test. 

2.1. Model setup and notation. Longitudinal studies are typically
characterized by a large number of subjects, $N$, that are each measured a
relatively small number of times, $n_i$, resulting in a total of
$n = \sum_{i=1}^{N} n_i$ observations. In this paper, we focus on cases where
some measurements are left censored at zero. However, the proposed procedures
can easily be modified to accommodate censoring from the right and/or left as
long as the censoring points are fixed.

Let $y^*_{ij}$ denote the potentially left censored $j$th response of the
$i$th subject, and let $y_{ij} = \max(0, y^*_{ij})$ be its corresponding
observed value. We start with the following latent regression model:

(2.1)  $y^*_{ij} = x_{ij}^T\alpha_0 + z_{ij}^T\beta + u_{ij}$,  $i = 1, \ldots, N$,  $j = 1, \ldots, n_i$,

where $x_{ij}$ and $z_{ij}$ are the $p \times 1$ and $q \times 1$ design vectors,
$\alpha_0$ and $\beta$ are the $p$- and $q$-dimensional unknown parameters and
$u_{ij}$ is the random error whose distribution may vary with $(x, z)$.
Throughout this paper, we assume that $u_{ij}$ are independent across $i$
(subjects) but are dependent, via exchangeable correlation, within a subject.
A typical example is the random intercept effect model with
$u_{ij} = a_i + e_{ij}$, where $a_i$ are i.i.d. random subject effects that
are independent of the i.i.d. measurement errors $e_{ij}$. We further assume
that the first element of $x_{ij}$ is 1, making the first component of
$\alpha_0$ an intercept.

From (2.1) and for a given $0 < \tau < 1$, we consider the following left
censored quantile regression model:

(2.2)  $y_{ij} = \max(0, x_{ij}^T\alpha_0 + z_{ij}^T\beta + u_{ij})$,  $i = 1, \ldots, N$,  $j = 1, \ldots, n_i$.

We assume that the $\tau$th quantile of $u_{ij}$ is zero. Other than that, no
distributional assumptions are made on $u$.

Since a major motivation for our study is to develop procedures for com- 
paring HIV treatments within model (2.2), we consider testing the following 
hypotheses: 



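(2.3)  $H_0 : \beta = 0$  versus  $H_n : \beta = n^{-1/2}\beta_0$,

where $\alpha_0$ is unspecified and $\beta_0 \in \mathbb{R}^q$ is fixed. This is
equivalent to comparing the null model

(2.4)  $y_{ij} = \max(0, x_{ij}^T\alpha_0 + u_{ij})$

versus the local alternative model

(2.5)  $y_{ij} = \max(0, x_{ij}^T\alpha_0 + n^{-1/2} z_{ij}^T\beta_0 + u_{ij})$.
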
To derive the quantile estimate of $\alpha_0$ in (2.4), we follow Powell [25]
and consider the minimization of the objective function

(2.6)  $Q_n(\alpha) = \sum_{i,j} \rho_\tau\{y_{ij} - \max(0, x_{ij}^T\alpha)\}$,

where $\rho_\tau(u) = u\{\tau - I(u < 0)\}$ is the quantile loss function. Under
mild conditions, it is established in Theorem 2.1 below that the quantile
estimator $\hat\alpha_0$ is strongly consistent and asymptotically normal in
both models (2.4) and (2.5) even though the objective function (2.6) treats
all observations as if they were independent.
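
The objective (2.6) is simple to evaluate even though it is hard to minimize. The following is a minimal Python sketch, assuming NumPy is available; the function names `check_loss` and `powell_objective` and the toy data are illustrative choices and do not come from the paper.

```python
import numpy as np

def check_loss(u, tau):
    """Quantile loss rho_tau(u) = u * (tau - I(u < 0))."""
    u = np.asarray(u, dtype=float)
    return u * (tau - (u < 0))

def powell_objective(alpha, X, y, tau):
    """Objective (2.6): sum over all rows of rho_tau{y - max(0, x'alpha)}."""
    fitted = np.maximum(0.0, X @ np.asarray(alpha, dtype=float))
    return float(np.sum(check_loss(y - fitted, tau)))

# Made-up data for illustration only (three observations, intercept + slope).
X = np.array([[1.0, -1.0], [1.0, 0.0], [1.0, 2.0]])
y = np.array([0.0, 1.5, 2.0])
value = powell_objective([1.0, 1.0], X, y, tau=0.5)
```

Because of the inner `max(0, .)`, this function is piecewise linear but not convex in `alpha`, which is why a general-purpose convex solver is not sufficient for the censored problem.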

In the absence of censored observations, minimization of (2.6) can be
performed efficiently by linear programming techniques. In fact, the solution
to (2.6) in such cases only requires software for quantile regression in
linear models. However, with censored observations, the objective function in
(2.6) is neither differentiable nor convex, and this presents a computational
challenge. A number of optimization approaches have been proposed in the
literature; see, for example, [1, 6, 20]. In this paper we employ the BRCENS
algorithm of [6] as implemented in the R package quantreg.

2.2. Large sample properties of $\hat\alpha_0$. Throughout the paper we
suppose a typical longitudinal data set where $n_i$ (the number of repeated
measurements for each subject) is bounded, but $N$ (the number of subjects)
grows. Note that all results are stated for a given $\tau$, although
dependence on $\tau$ is not explicit in the various expressions. To establish
all the large sample properties in this paper, we require the following
conditions:

A1. The parameter vector $\alpha_0$ is an interior point of a compact parameter
space $A \subset \mathbb{R}^p$.

A2. Let $\|X_{ij}\|$ denote the Euclidean norm of $X_{ij}$, where $X_{ij} = (x_{ij}^T, z_{ij}^T)^T$, then
maxjj II = 0(n-^/^) and n'^^^WxijW^ = 0{1) asn^oo. 

A3. There exists $\varepsilon_0 > 0$ such that as $n \to \infty$,
$\liminf n^{-1}\sum_{i,j} I(|x_{ij}^T\alpha| > \varepsilon_0) > 0$
for any $\|\alpha\| \neq 0$ and
$D_{1n}(\alpha_0) = n^{-1}\sum_{i,j} I(x_{ij}^T\alpha_0 > \varepsilon_0)\, x_{ij}x_{ij}^T \to D_1$,
where $D_1$ is a positive definite matrix.

A4. The $u_{ij}$ have a common marginal distribution function $F$ and a
Lebesgue density $f$, which is Lipschitz in a neighborhood of 0. Also, there
exist some positive values $q_1$ and $q_2$ such that $f(u) < q_2$ for all $u$,
and $f(u) > q_1$ for $|u| < q_1$.

A5. For any $d > 0$, there exists a positive constant $C$ such that
$n^{-1}\sum_{i,j} I(|x_{ij}^T\alpha_0| < \|x_{ij}\| d) < Cd$.

A6. Let $D_{2n}(\alpha_0) = n^{-1}\sum_{i,j} I(x_{ij}^T\alpha_0 > 0)\, x_{ij}z_{ij}^T \to D_2$,
as $n \to \infty$, where $D_2$ is a $p \times q$ matrix.

A7. The joint distribution function of $u_{ij_1}$ and $u_{ij_2}$ for any $i$ and
$j_1 \neq j_2$, denoted as $F_{12}$, is Lipschitz in a neighborhood of $(0,0)$.

A8. Let $D_{3n}(\alpha_0) = n^{-1}\sum_{i,j} I(x_{ij}^T\alpha_0 > 0)\, z^*_{ij}z_{ij}^T \to D_3$,
as $n \to \infty$ (with $z^*_{ij}$ as defined in Section 2.3).

The following theorem states the large sample properties of $\hat\alpha_0$
under models (2.4) and (2.5):

Theorem 2.1. For the longitudinal censored regression models (2.4)
and (2.5):

(i) If conditions A1-A4 hold, then the censored quantile estimator
$\hat\alpha_0$ converges to $\alpha_0$ almost surely.

(ii) If conditions A1-A5 hold, then under model (2.4) the censored
quantile estimator $\hat\alpha_0$ is asymptotically normal,

{r„(5)}-i/2^(ao - «o) ^ N{0,I). 

(iii) If conditions A1-A6 hold, then under model (2.5) the censored
quantile estimator $\hat\alpha_0$ is asymptotically normal,

{r„(5)}-V2^(do - ao - D^'D2po) ^ N{0,I), 

where 

r„(<5)=n-i{/(0)}-2Z)-i 

X \ ^(^5"o > 0)xijxfjTil - r) 
I ij 

and $\delta = P(u_{i1} < 0, u_{i2} < 0)$ measures the intra-subject dependence.

Remark 1. The common density assumption on $u_{ij}$ in A4 is made for
convenience, but is not necessary for either the strong consistency or the
asymptotic normality of $\hat\alpha_0$. In order for Theorem 2.1 to hold, it
suffices that the $\tau$th quantile of $u_{ij}$ is 0 for all $i$ and $j$ with
density functions $f_{ij}$, which are continuously differentiable in a
neighborhood of zero and uniformly bounded away from zero and infinity. When
$y_{ij}$ is left censored at some known values $C_{ij}$, the asymptotic results
developed in this paper hold, but $x_{ij}^T\alpha_0$ in conditions A3, A5, A6
and A8 must be replaced with $x_{ij}^T\alpha_0 - C_{ij}$.

2.3. Quantile rank score test. To test the hypotheses in (2.3), one can
explore the asymptotic normality of censored quantile estimators of the
parameters $(\alpha_0, \beta)$ in model (2.2). However, following the proof of
Theorem 2.1 part (ii), one can see that the asymptotic variance-covariance
matrix of these estimators is a function of the unspecified density of error
terms. This hampers the use of a Wald-type test. Moreover, it has been shown
that, in a quantile regression setup, a Wald-type test is generally unstable
at small sample sizes (e.g., [3, 18]). The use of likelihood ratio-based tests
is even more daunting for our setup because the limiting distribution is a
complicated function of the unknown error density. To avoid these problems
and the need for estimating a density, which is in our testing problem an
infinite dimensional nuisance parameter, we turn to the quantile rank score
test proposed in [9] for independent and uncensored data.

To present our test, we rewrite model (2.2) in matrix form

(2.7)  $Y = \max(0_n, X\alpha_0 + Z\beta + U)$,

where $Y$ and $U$ are $n$-dimensional vectors, $0_n$ is an $n \times 1$ vector
consisting of zeros and $X$ and $Z$ are $n \times p$ and $n \times q$ matrices,
respectively. Let $X^* = \mathrm{diag}\{I(X\alpha_0 > 0)\}X$,
$H = X^*(X^{*T}X^*)^{-1}X^{*T}$ and $Z^* = (z^*_{ij})_{n \times q} = (I - H)Z$.
Note that $Z^*$, which is a linear combination of the design matrix $Z$, is
orthogonal to the space spanned by those $x_{ij}$'s that satisfy
$x_{ij}^T\alpha_0 > 0$. Our proposed quantile rank score test is based on

(2.8)  $S_n = n^{-1/2}\sum_{i,j} I(x_{ij}^T\hat\alpha_0 > 0)\, z^*_{ij}\, \psi_\tau(\hat u_{ij})$,

where $\hat u_{ij} = y_{ij} - \max(0, x_{ij}^T\hat\alpha_0)$, $\hat\alpha_0$ is the
censored quantile estimator of $\alpha_0$ in model (2.4), and
$\psi_\tau(u) = \tau - I(u < 0)$ is the quantile score function. It is worth
pointing out that $\psi_\tau(u)$ is the piecewise gradient of the quantile loss
function $\rho_\tau(u)$, and that $S_n$ only includes scores from those
observations for which the corresponding $x_{ij}^T\alpha_0$ are estimated to be
uncensored.

Let

(2.9)  $V_n(\delta; \hat\alpha_0) = n^{-1}\Big\{ \sum_{i,j} I(x_{ij}^T\hat\alpha_0 > 0)\, z^*_{ij}z^{*T}_{ij}\, \tau(1 - \tau)
        + \sum_{i}\sum_{j \neq j'} I(x_{ij}^T\hat\alpha_0 > 0,\ x_{ij'}^T\hat\alpha_0 > 0)\, z^*_{ij}z^{*T}_{ij'}\, (-\tau^2 + \delta) \Big\}$,

where $\delta$ is defined in Theorem 2.1(iii).

We define the Quantile Rank Score (QRS) test statistic as

(2.10)  $T_n = S_n^T\{V_n(\hat\delta; \hat\alpha_0)\}^{-1}S_n$,

where $\hat\delta = L^{-1}\sum_{i}\sum_{j \neq j'} I(x_{ij}^T\hat\alpha_0 > 0,\ x_{ij'}^T\hat\alpha_0 > 0)\, I(\hat u_{ij} < 0,\ \hat u_{ij'} < 0)$
and $L$ denotes the total number of pairs of repeated measurements that are
predicted to be uncensored. Note that when all the observations are uncensored
and independent (i.e., $\delta = \tau^2$ and $n_i = 1$), $T_n$ reduces to the
QRS test statistic proposed in [9].

Theorem 2.2. Assume that conditions A1-A8 hold. Then, as $n \to \infty$,
we have:

(i) under $H_0$, the statistic $T_n$ is asymptotically $\chi^2$ with $q$ degrees
of freedom;

(ii) under $H_n$, $T_n$ is asymptotically noncentral $\chi^2$ with $q$ degrees of
freedom and with noncentrality parameter
$\beta_0^T D_3^T [V_n(\delta; \alpha_0)]^{-1} D_3 \beta_0 f^2(0)$.
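
The pieces (2.8)-(2.10) can be assembled as in the following Python sketch. It assumes NumPy, takes an already computed estimate `alpha_hat` as input (the censored fit itself is not shown), and uses the hypothetical name `qrs_statistic`; it is an illustration of the formulas above, not the authors' implementation.

```python
import numpy as np

def qrs_statistic(y, X, Z, subject, alpha_hat, tau):
    """Sketch of T_n in (2.10); `subject` gives one label per row."""
    y, X, Z = (np.asarray(a, dtype=float) for a in (y, X, Z))
    subject = np.asarray(subject)
    n, q = len(y), Z.shape[1]
    xb = X @ np.asarray(alpha_hat, dtype=float)
    keep = xb > 0                          # rows predicted to be uncensored
    Xs = X * keep[:, None]                 # X* = diag{I(X alpha > 0)} X
    Zs = Z - Xs @ np.linalg.solve(Xs.T @ Xs, Xs.T @ Z)   # Z* = (I - H) Z
    u_hat = y - np.maximum(0.0, xb)
    psi = tau - (u_hat < 0)                # quantile score psi_tau
    S = (Zs * (keep * psi)[:, None]).sum(axis=0) / np.sqrt(n)   # (2.8)

    diag_term, cross_term = np.zeros((q, q)), np.zeros((q, q))
    neg_pairs = n_pairs = 0
    for s in np.unique(subject):
        idx = np.flatnonzero((subject == s) & keep)
        Zi, k, m = Zs[idx], int((u_hat[idx] < 0).sum()), len(idx)
        own = Zi.T @ Zi
        tot = Zi.sum(axis=0)
        diag_term += own
        cross_term += np.outer(tot, tot) - own   # sum over j != j'
        n_pairs += m * (m - 1)                   # ordered uncensored pairs
        neg_pairs += k * (k - 1)                 # pairs with both residuals < 0
    # With no within-subject pairs, fall back to the independence value tau^2.
    delta_hat = neg_pairs / n_pairs if n_pairs > 0 else tau ** 2
    V = (tau * (1 - tau) * diag_term + (delta_hat - tau ** 2) * cross_term) / n
    return float(S @ np.linalg.solve(V, S)), delta_hat
```

Under Theorem 2.2(i), the returned statistic would be compared with a $\chi^2_q$ critical value (for example 3.841 at the 5% level when $q = 1$).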

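Remark 2. The joint probability $\delta$ in (2.9) captures the sign correlation
between errors from the same subject. When $\delta \in (\tau^2, \tau]$ these
errors are positively correlated, when $\delta \in [0, \tau^2)$ they are
negatively correlated and when $\delta = \tau^2$ the errors are independent.
Ignoring the intra-subject dependence leads to a test $T_n^*$, say. Depending
on $\delta$ and/or $Z$, this test statistic is either invalid or lacks power.
For illustration, consider the case where $q = 1$ and
$\delta \in (\tau^2, \tau]$. Then we have

$\dfrac{V_n(\delta; \hat\alpha_0)}{V_n(\tau^2; \hat\alpha_0)} = 1 + \dfrac{\sum_{i}\sum_{j \neq j'} I(x_{ij}^T\hat\alpha_0 > 0,\ x_{ij'}^T\hat\alpha_0 > 0)\, z^*_{ij}z^*_{ij'}\, (\delta - \tau^2)}{\sum_{i,j} I(x_{ij}^T\hat\alpha_0 > 0)\, z^{*2}_{ij}\, (\tau - \tau^2)}.$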



When testing the between-subject factor effect, for example in model (3.1)
of the simulation study, the signs of $z^*_{ij}z^*_{ij'}$ are positive for the
same $i$th subject, so $V_n(\delta; \hat\alpha_0)/V_n(\tau^2; \hat\alpha_0) > 1$
and $T_n^*$ leads to inflated Type I errors. For a given significance level
$\theta$, the power of $T_n^*$ under $H_n$ is

$1 - \Phi\left\{ z_{\theta/2}\sqrt{\dfrac{V_n(\tau^2; \alpha_0)}{V_n(\delta; \alpha_0)}} - \dfrac{\mu_n}{\sqrt{V_n(\delta; \alpha_0)}} \right\},$

where $\mu_n = n^{-1} f(0)\sum_{i,j} I(x_{ij}^T\alpha_0 > 0)\, z^*_{ij}z_{ij}\beta_0$,
$\Phi$ denotes the CDF of the standard normal distribution and $z_\theta$ is
the upper $\theta$th quantile of $\Phi$. Therefore, when testing the
within-subject factor effect, for example, we can have the signs of all
$z^*_{ij}z^*_{ij'}$ be negative for the same subject, thus
$V_n(\delta; \alpha_0)/V_n(\tau^2; \alpha_0) < 1$ and $T_n^*$ has diminished
power.
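
As a worked illustration of how $\delta$ reflects intra-subject dependence, suppose (purely as an assumption for this example; the model itself makes no distributional assumption) that two errors from the same subject are bivariate normal with mean zero and correlation $\rho$, and take $\tau = 0.5$. The standard orthant probability formula for the bivariate normal then gives:

```latex
\delta = P(u_{ij} < 0,\ u_{ij'} < 0) = \frac{1}{4} + \frac{\arcsin\rho}{2\pi},
\qquad
\rho = 0 \ \Rightarrow\ \delta = \tfrac{1}{4} = \tau^2,
\qquad
\rho = 0.5 \ \Rightarrow\ \delta = \tfrac{1}{4} + \tfrac{1}{12} = \tfrac{1}{3} \in (\tau^2, \tau].
```

In this illustrative setting a correlation of 0.5 gives $\delta - \tau^2 = 1/12$, which is the weight attached to the cross-product term in (2.9).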



2.4. Construction of confidence intervals. 



2.4.1. Confidence intervals via inversion of rank score tests. The
developed rank score test can be extended to test $H_0 : \beta = \beta_0$ by
simply rewriting model (2.7). We denote $\tilde y_{ij} = y_{ij} - z_{ij}^T\beta_0$.
The fact that $y_{ij}$ is censored at 0 implies that $\tilde y_{ij}$ is censored
from the left at $-z_{ij}^T\beta_0$. It is clear that under $H_0$, the $\tau$th
quantile estimate of $\alpha$ can be obtained by minimizing
$\sum_{i,j}\rho_\tau\{\tilde y_{ij} - \max(-z_{ij}^T\beta_0,\ x_{ij}^T\alpha)\}$.
The quantile rank score test can be constructed following the same procedure
as in Section 2.3 by replacing $y$ with $\tilde y$.

For a quantile coefficient $\beta \in \mathbb{R}^1$, the confidence interval
can be constructed by inverting the rank score test. Using the fact that the
test statistic $T_n$ is convex in $\beta$, we can obtain a $100(1 - \theta)\%$
confidence interval consisting of the $\beta_0$'s at which the test on
$H_0 : \beta = \beta_0$ will not be rejected. Readers are referred to
[18, 19] or [3] for details of confidence interval construction for
uncensored and independent data.
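
The inversion step can be sketched as a grid search. In the Python sketch below, `stat_fn` is a hypothetical callback that would return $T_n$ for a given null value $\beta_0$ (refitting under $H_0$ as described above); the quadratic toy statistic and the grid are assumptions made only so that the example runs on its own. The default critical value 3.841 is the 95th percentile of $\chi^2_1$.

```python
def invert_test(stat_fn, grid, critical_value=3.841):
    """Return (lower, upper) over grid points b with stat_fn(b) <= critical_value.

    Reporting only the extremes relies on the accepted set being an interval,
    which holds when the statistic is convex in b.
    """
    accepted = [b for b in grid if stat_fn(b) <= critical_value]
    if not accepted:
        return None
    return min(accepted), max(accepted)

# Toy stand-in for T_n(beta_0): a quadratic centered at 1 (illustration only).
toy_stat = lambda b: ((b - 1.0) / 0.5) ** 2
grid = [i / 100 for i in range(-100, 301)]
interval = invert_test(toy_stat, grid)
```

A finer grid, or a root search on each side of the point estimate, would sharpen the end points at the cost of more evaluations of the test statistic.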

2.4.2. Blockwise modified bootstrap method. The more computationally 
demanding resampling method offers an alternative approach for statistical 
inference. Here we introduce a modified bootstrap approach through block- 
wise pairs resampling, denoted by Boot. For easy presentation, we denote 

In applications of quantile regression, the pairs bootstrap is often chosen
over the residual bootstrap because it is insensitive to model
misspecification and heteroscedasticity. The idea of the pairs bootstrap is
to draw pairs, in our case, $(y_{ij}, x_{ij})$, at random from the original
observations with replacement. Note that in model (2.1), the observations are
dependent within each subject. To retain this dependence structure, we treat
the observations in each subject as a block and resample the block pairs
$\{(y_{ij}, x_{ij}),\ j = 1, \ldots, n_i\}$.

As we have seen, computation of Powell's estimator is complicated by the
nonconvexity of the objective function (2.6). Therefore, direct
implementation of the bootstrap approach could be prohibitively expensive in
terms of computation. To reduce the computational cost, we employ a modified
bootstrap method proposed in [2] for median regression with independent
data. From now on, we denote the bootstrap sample of $(y_{ij}, x_{ij})$ by
$(y^*_{ij}, x^*_{ij})$.

It is known that the solution of
$\min_{\gamma}\sum_{i,j}\rho_\tau(y_{ij} - x_{ij}^T\gamma)\, I(x_{ij}^T\gamma_0 > 0)$
is asymptotically equivalent to Powell's estimator. Making use of the
$\hat\gamma$ that results from fitting the model with the observed data, the
modified bootstrap estimator $\gamma^*$ can be obtained by minimizing

(2.12)  $\sum_{i,j}\rho_\tau(y^*_{ij} - x^{*T}_{ij}\gamma)\, I(x^{*T}_{ij}\hat\gamma > 0)$.

Note that (2.12) is a convex function, and thus $\gamma^*$ can be calculated in
the same way as in uncensored quantile regression. A $100(1 - \theta)\%$
confidence interval for $\gamma$ can be obtained with the lower and upper
bound calculated as the $(\theta/2)$th and $(1 - \theta/2)$th quantiles of
those $\gamma^*$'s.
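
The resampling scheme can be sketched as follows. This Python sketch assumes NumPy and covers only the blockwise (subject-level) pairs resampling and the percentile interval; the argument `estimator` is a hypothetical callback that would, in practice, minimize (2.12) on each resampled data set, and the name `block_bootstrap_ci` is likewise illustrative.

```python
import numpy as np

def block_bootstrap_ci(y, X, subject, estimator, n_boot=500, theta=0.05, seed=0):
    """Percentile interval from subject-level (blockwise) pairs resampling."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    subject = np.asarray(subject)
    labels = np.unique(subject)
    rows = {s: np.flatnonzero(subject == s) for s in labels}
    draws = []
    for _ in range(n_boot):
        # Draw whole subjects with replacement so that each subject's
        # observations stay together as one block.
        picked = rng.choice(labels, size=len(labels), replace=True)
        idx = np.concatenate([rows[s] for s in picked])
        draws.append(estimator(y[idx], X[idx]))
    draws = np.asarray(draws)
    lower = np.quantile(draws, theta / 2, axis=0)
    upper = np.quantile(draws, 1 - theta / 2, axis=0)
    return lower, upper
```

Resampling subjects rather than individual rows is what preserves the within-subject dependence; the callback can close over the full-sample fit $\hat\gamma$ needed for the indicator in (2.12).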

3. Simulation study. To assess the performance of the inference proce- 
dures described in Section 2, we conduct a simulation study. We explore 
the effects of different proportions of censoring and various degrees of intra- 
subject dependency on estimation, testing and confidence intervals. 
3.1. Model descriptions. In the simulation, the latent response variable
$y^*_{ij}$ is generated from the following model:

(3.1)  $y^*_{ij} = 1 + x_{ij}\alpha + z_{ij}\beta + \sigma_{ij}\{u_{ij} - F_u^{-1}(\tau)\}$,  $i = 1, \ldots, N$,  $j = 1, \ldots, 10$,

where $u_{ij} = a_i + e_{ij}$ is the random error and $F_u^{-1}(\tau)$ is the
$\tau$th quantile of $u$, $a_i$ is the random subject effect, $x_{ij}$ and
$e_{ij}$ are i.i.d. from the standard normal distribution, $z_{ij} = 0$ for the
first $N/2$ subjects and $z_{ij} = 1$ for the rest. Four different cases are
considered:

Case 1. A fixed effect model ($a_i = 0$) with homoscedastic term
$\sigma_{ij} = 1$.

Case 2. A random effect model with $a_i$ that are i.i.d. from the standard
normal and $\sigma_{ij} = 1$. This yields a homoscedastic model with an
intra-subject correlation coefficient of 0.5.

Case 3. A random effect model with $a_i$ that are i.i.d. from $N(0, 9)$ and
$\sigma_{ij} = 1$. This yields a homoscedastic model with an intra-subject
correlation coefficient of 0.9.

Case 4. A heteroscedastic model with $a_i \sim N(0, 1)$ and
$\sigma_{ij} = 1 + |x_{ij}|$.

For all cases we consider both 20% and 40% censoring. The observed response
variable $y_{ij}$ is generated as the maximum of 0 and $y^*_{ij}$ minus the
20th or 40th percentile of $\{y^*_{ij}\}$, respectively. Our analysis focuses
on the effect of $z_{ij}$ at three quartiles, as per the main objective of
Section 2. To evaluate Type I error and power, the nominal significance level
is set to 5%, $\alpha$ is fixed at 10, $\beta$ is varied from 0 to 1 and the
simulation was repeated 500 times in all cases.

3.2. Evaluation of the proposed estimator for $\alpha$. We first compare the
finite sample efficiency of the omniscient estimator, our estimator
$\hat\alpha$ and two naive estimators. The omniscient estimator is obtained by
fitting the quantile regression model with the latent response variable and
thus serves as a gold standard. The first naive estimator, Naive1, is obtained
using all observations as if none were censored. The second, Naive2, is
computed using only the uncensored observations.

Table 1 summarizes the bias and mean squared error (MSE) of these four
estimators in Cases 1-4 for $N = 50$ and $\beta = 0$. Compared to the
omniscient estimator, $\hat\alpha$ performs universally well, even when the
data are highly correlated (Case 3). As expected, the two naive estimators
have larger biases and mean squared errors than $\hat\alpha$, even more so for
the higher proportion of censoring.

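
The data-generating design of Section 3.1 can be sketched in Python as follows, assuming NumPy. The function name `simulate_case` and its arguments are hypothetical; the censoring step reflects one reading of the description (shift the latent response by its sample percentile, then censor at zero), and the closed-form quantile uses the fact that $a_i + e_{ij}$ is normal in all four cases.

```python
import numpy as np
from statistics import NormalDist

def simulate_case(N, tau, cp, alpha=10.0, beta=0.0, sd_a=1.0,
                  hetero=False, n_i=10, seed=0):
    """Draw one data set from model (3.1) and left-censor a fraction cp.

    Cases 1-3 correspond to sd_a = 0, 1, 3; Case 4 to sd_a = 1, hetero=True.
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((N, n_i))
    e = rng.standard_normal((N, n_i))
    a = sd_a * rng.standard_normal((N, 1))       # random subject effect a_i
    # z = 0 for the first N/2 subjects and 1 for the rest (between-subject).
    z = np.repeat((np.arange(N) >= N // 2).astype(float)[:, None], n_i, axis=1)
    sigma = 1.0 + np.abs(x) if hetero else np.ones_like(x)
    # u = a + e is N(0, sd_a^2 + 1), so its tau-th quantile is closed-form.
    q_tau = np.sqrt(sd_a ** 2 + 1.0) * NormalDist().inv_cdf(tau)
    y_latent = 1.0 + alpha * x + beta * z + sigma * (a + e - q_tau)
    shift = np.quantile(y_latent, cp)            # cp-th sample quantile
    y_obs = np.maximum(0.0, y_latent - shift)    # left censored at zero
    return y_obs, x, z

y_obs, x, z = simulate_case(N=50, tau=0.5, cp=0.2)
```

Subtracting $F_u^{-1}(\tau)$ keeps the $\tau$th conditional quantile of the latent response equal to $1 + x_{ij}\alpha + z_{ij}\beta$, so the target coefficients are the same at every quantile level considered.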
Table 1

Comparison of the omniscient estimator, our estimator $\hat\alpha$ and two
naive estimators for $\alpha$ at $N = 50$. Naive1 is obtained by uncensored
quantile regression and Naive2 is obtained using only the uncensored
observations. CP stands for the censoring proportion and MSE stands for mean
squared error

                  Omniscient       $\hat\alpha$          Naive1            Naive2
Case    CP          MSE     Bias      MSE     Bias      MSE     Bias      MSE     Bias

$\tau = 0.25$
Case 1  0.2       0.004        ?    0.008    0.008    0.360   -0.591    0.013   -0.072
        0.4       0.004        ?    0.015    0.014    6.791   -2.576    0.033   -0.135
Case 2  0.2       0.007   -0.007    0.017    0.001    0.719   -0.834    0.043   -0.159
        0.4       0.007   -0.007    0.032    0.013    9.408   -3.039    0.119   -0.295
Case 3  0.2       0.037   -0.013    0.093    0.005    2.953   -1.685    0.699   -0.762
        0.4       0.037   -0.013    0.180    0.019   20.953   -4.548    2.266   -1.427
Case 4  0.2       0.043   -0.014    0.098    0.017    2.621   -1.594    0.392   -0.550
        0.4       0.043   -0.014    0.183    0.051   16.572   -4.045    0.584   -0.647

$\tau = 0.5$
Case 1  0.2       0.003    0.001    0.007    0.006    0.717   -0.838    0.012   -0.067
        0.4       0.003    0.001    0.014    0.004   10.878   -3.277    0.033   -0.136
Case 2  0.2       0.006   -0.001    0.013    0.002    1.142   -1.058    0.031   -0.138
        0.4       0.006   -0.001    0.023    0.007   11.686   -3.401    0.094   -0.267
Case 3  0.2       0.032   -0.009    0.063   -0.006    2.782   -1.646    0.448   -0.610
        0.4       0.032   -0.009    0.115    0.017   15.583   -3.931    1.434   -1.133
Case 4  0.2       0.021    0.002    0.040    0.009    2.218   -1.476    0.129   -0.304
        0.4       0.035   -0.002    0.128    0.035   17.266   -4.141    0.436   -0.568

$\tau = 0.75$
Case 1  0.2       0.004    0.002    0.008    0.006    2.408   -1.537    0.012   -0.062
        0.4       0.004    0.002    0.015    0.002   15.134   -3.879    0.030   -0.123
Case 2  0.2       0.007   -0.???    0.013    0.005    2.724   -1.635    0.027   -0.118
        0.4       0.007   -0.???    0.023    0.008   15.157   -3.883    0.080   -0.237
Case 3  0.2       0.038   -0.008    0.063    0.005    3.384   -1.818    0.340   -0.514
        0.4       0.038   -0.008    0.101   -0.009   14.717   -3.824    0.998   -0.927
Case 4  0.2       0.040   -0.001    0.058    0.010    4.420   -2.085    0.222   -0.403
        0.4       0.040   -0.001    0.116    0.019   20.051   -4.467    0.372   -0.509

3.3. Performance of the proposed quantile rank score test. We evaluate
the performance of our proposed Quantile Rank Score test (QRS) by comparing
it to five other test statistics: a rank score test for censored data that
assumes independence (Indep); a naive rank score test (Naive1) that assumes
the observations are uncensored; another naive rank score test (Naive2) that
uses only the uncensored observations; a bootstrap-based test, Boot, with 500
resamplings; and the omniscient rank score test, Omni, based on the latent
response variable.

Table 2

The Type I errors in Cases 1-4 at $N = 10$ and $N = 50$

              Case 1            Case 2            Case 3            Case 4
$\tau$    0.25   0.5  0.75  0.25   0.5  0.75  0.25   0.5  0.75  0.25   0.5  0.75

CP = 0.2, N = 10
Indep     0.04  0.06  0.06  0.30  0.30  0.26  0.42  0.43  0.42  0.30  0.22  0.27
Naive1    0.05  0.05  0.04  0.07  0.07  0.05  0.07  0.07  0.05  0.07  0.06  0.04
Naive2    0.04  0.08  0.09  0.07  0.07  0.09  0.09  0.09  0.13  0.07  0.08  0.12
Omni      0.05  0.05  0.06  0.06  0.06  0.05  0.07  0.06  0.06  0.05  0.05  0.06
QRS       0.0?  0.0?  0.06  0.06  0.0?  0.0?  0.0?  0.0?  0.04  0.0?  0.0?  0.06
Boot      0.05  0.07  0.07  0.09  0.08  0.09  0.09  0.07  0.10  0.10  0.08  0.04

CP = 0.4, N = 10
Indep     0.05  0.07  0.05  0.26  0.24  0.23  0.37  0.39  0.38  0.26  0.19  0.22
Naive1    0.04  0.06  0.04  0.05  0.04  0.04  0.06  0.06  0.05  0.06  0.05  0.05
Naive2    0.05  0.09  0.10  0.10  0.09  0.12  0.12  0.14  0.23  0.11  0.12  0.13
Omni      0.05  0.05  0.06  0.06  0.06  0.05  0.07  0.06  0.06  0.05  0.05  0.06
QRS       0.04  0.08  0.05  0.06  0.07  0.06  0.07  0.07  0.07  0.06  0.07  0.05
Boot      0.05  0.08  0.06  0.08  0.07  0.10  0.10  0.08  0.13  0.09  0.10  0.10

CP = 0.2, N = 50
Indep     0.05  0.06  0.04  0.29  0.27  0.29  0.44  0.40  0.43  0.28  0.20  0.28
Naive1    0.05  0.04  0.04  0.06  0.04  0.06  0.05  0.05  0.03  0.05  0.05  0.05
Naive2    0.06  0.06  0.06  0.05  0.05  0.06  0.05  0.06  0.08  0.05  0.06  0.06
Omni      0.05  0.05  0.04  0.04  0.05  0.04  0.05  0.04  0.05  0.04  0.06  0.04
QRS       0.05  0.06  0.05  0.04  0.04  0.04  0.06  0.04  0.04  0.04  0.06  0.04
Boot      0.04  0.06  0.04  0.05  0.05  0.06  0.06  0.04  0.06  0.05  0.06  0.05

CP = 0.4, N = 50
Indep     0.06  0.05  0.05  0.22  0.24  0.22  0.37  0.33  0.35  0.23  0.17  0.23
Naive1    0.05  0.04  0.06  0.04  0.04  0.08  0.05  0.04  0.06  0.06  0.06  0.05
Naive2    0.05  0.06  0.06  0.06  0.05  0.08  0.07  0.08  0.10  0.05  0.06  0.09
Omni      0.05  0.05  0.04  0.04  0.05  0.04  0.05  0.04  0.05  0.04  0.06  0.04
QRS       0.05  0.05  0.04  0.04  0.04  0.04  0.06  0.04  0.04  0.04  0.04  0.04
Boot      0.04  0.06  0.05  0.04  0.04  0.05  0.06  0.06  0.07  0.03  0.05  0.05



Fig. 1. Power curves of QRS, Boot, Omni and Naive1 in Cases 1-4 at
$\tau = 0.75$ with $N = 50$ and 20% censoring.

Table 2 summarizes the Type I error rates of all six test statistics in
Cases 1-4 for $N = 10$ or 50. The Type I error of Boot is estimated by
the proportion of cases where 0 is not contained in the 95% confidence
interval for $\beta$. The rank score test Indep without the $\delta$
adjustment clearly loses control of the Type I error in Cases 2, 3 and 4,
where the data are correlated due to the random effects $a_i$. Moreover,
the size of the deterioration increases with the degree of intra-subject
dependency. As to the naive tests, we find that, in general, Naive2 has
inflated Type I errors at $N = 10$, and Naive1 lacks power (see Figure 1).
The modified bootstrap method, Boot, preserves the nominal significance
level well at $N = 50$, but gives consistently inflated Type I errors in
the smaller samples with $N = 10$. Generally speaking, the QRS and
omniscient methods preserve the nominal significance level reasonably well
in all cases.
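
As a rough aid for reading the Type I error rates in Table 2, the sketch below computes the Monte Carlo standard error of a rejection proportion. It assumes that each table entry is based on 500 independent simulated data sets (the run count quoted in this section for the timing comparison and for the confidence intervals); the function names are illustrative and not part of the paper.

```python
import math


def mc_standard_error(p: float, runs: int) -> float:
    """Binomial standard error of a rejection proportion over `runs` replications."""
    return math.sqrt(p * (1.0 - p) / runs)


def within_mc_band(estimate: float, nominal: float = 0.05,
                   runs: int = 500, z: float = 1.96) -> bool:
    """True if `estimate` lies within z standard errors of the nominal level.

    The default of 500 runs is an assumption about the simulation design.
    """
    return abs(estimate - nominal) <= z * mc_standard_error(nominal, runs)


if __name__ == "__main__":
    print(mc_standard_error(0.05, 500))
    for rate in (0.04, 0.06, 0.29):
        print(rate, within_mc_band(rate))
```

Under this assumption, entries between 0.04 and 0.06 are compatible with the nominal 5% level up to simulation noise, whereas the Indep entries in Cases 2-4 are far outside that band.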

Figure 1 plots the power curves of QRS, Boot, Omni and Naive1 at
$\tau = 0.75$ with $N = 50$ and 20% censoring. Naive1 loses a great deal
of power by ignoring the censoring. QRS and Boot both perform as well as
the omniscient method in all cases. The modified bootstrap approach
reduces the computational time compared to a direct implementation of the
bootstrap, but it is still much more computationally intensive than QRS.
For example, using R (version 2.3.1) on a 3.4 GHz Dell computer with
3.0 GB of RAM to simulate Case 2 at $\tau = 0.75$ with $N = 10$ and 20%
censoring, QRS took 23 seconds for 500 runs of simulation, compared to
3,013 seconds for Boot with 500 resamples. Furthermore, QRS is robust to
the heteroscedasticity considered in Case 4, even though it is developed
for models with homoscedastic errors.

Fig. 2. Power curves of QRS under the local alternative
$H_n: \beta = 50n^{-1/2}$ in Cases 1-4 at $\tau = 0.5$ with 20% censoring.
All the curves are generated by fitting smoothing splines over the
estimated powers against $n$.

We study the performance of QRS under the local alternative
$H_n: \beta = n^{-1/2}\beta_0$ for $n = 10N$ varying from 200 to 5000. We
let $\beta_0 = 10$ in Case 1, $\beta_0 = 20$ in Cases 2 and 4 and
$\beta_0 = 50$ in Case 3. Figure 2 shows that the local power of QRS
remains stable as $n$ increases. This observation is consistent with the
asymptotic results in Section 2.3. Note that QRS exhibits different powers
in the four cases due to the different variation used to generate the
subject effects.
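
To make the local alternative concrete, the following minimal sketch evaluates $\beta = n^{-1/2}\beta_0$ at the two ends of the range of $n$ mentioned above. The function name is hypothetical; only the formula and the values of $\beta_0$ and $n$ come from the text.

```python
import math


def local_alternative(beta0: float, n: int) -> float:
    """Coefficient value under H_n: beta = n^(-1/2) * beta0."""
    return beta0 / math.sqrt(n)


if __name__ == "__main__":
    for beta0 in (10, 20, 50):
        for n in (200, 5000):
            print(beta0, n, local_alternative(beta0, n))
```

The effect shrinks at the root-$n$ rate, which is why a power curve that stays flat in $n$ is the behaviour predicted by the asymptotic theory.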

3.3.1. Assessment of confidence intervals for $\beta$. For each simulated
data set under the hypothesis $\beta = 1$, we obtain 95% confidence
intervals for $\beta$ using the bootstrap method, Boot, and by inverting
the QRS test following the procedure described in Section 2.4. For
comparison, we also obtain confidence intervals by inverting the Omni test
and the two naive tests, Naive1 and Naive2. The estimated mean lengths
(EML) of these five confidence interval procedures and the empirical
coverage probabilities (ECP) across the 500 intervals in Case 2 at
$N = 50$ are summarized in Table 3.

The two naive methods give poor confidence intervals at all quantile
levels. The empirical coverage probabilities of Omni, QRS and Boot are, in
general, close to the nominal level. Moreover, the mean lengths of the QRS
confidence intervals are comparable to those of Omni.
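
The two summary measures used here can be computed as in the sketch below, which assumes the simulated intervals are available as (lower, upper) pairs. The function name and the toy intervals are illustrative only and are not taken from the paper.

```python
from typing import Iterable, Tuple


def coverage_and_length(intervals: Iterable[Tuple[float, float]],
                        true_value: float) -> Tuple[float, float]:
    """Return (ECP, EML): the share of intervals containing `true_value`
    and the average interval length."""
    pairs = list(intervals)
    covered = sum(1 for lo, hi in pairs if lo <= true_value <= hi)
    total_length = sum(hi - lo for lo, hi in pairs)
    return covered / len(pairs), total_length / len(pairs)


if __name__ == "__main__":
    # Three made-up intervals for a true coefficient of 1.
    toy = [(0.5, 1.5), (1.1, 2.0), (0.0, 1.0)]
    print(coverage_and_length(toy, true_value=1.0))
```

A good procedure combines an ECP close to the nominal 0.95 with a short EML; a short interval with poor coverage is not an improvement.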
4. Application to an HIV-RNA level study. In this section we apply
the methodology proposed in Section 2 to analyze the HIV-RNA data in
[10]. This clinical trial followed a total of 481 HIV-infected individuals
with baseline HIV-RNA levels in their plasma (viral load) greater than
1000 copies/ml. For each individual, viral load was measured at time zero
and then approximately 2, 4, 8, 16 and 24 weeks later. Due to the
detection limit of the assay used to measure viral load, 22% of the
measurements were censored from below at 200 copies/ml. We refer readers
to [10] for a more thorough discussion of this AIDS-related clinical
trial. We seek to compare the viral load response (VL) to a double
protease inhibitor (DPI) regimen, herein referred to as the treatment,
with that of a single protease inhibitor (SPI) regimen, herein referred to
as the control.

Our preliminary investigation shows that the $\log_{10}$VL from both
regimens drops sharply during the first two weeks, and then remains rather
stable. To capture this evolution pattern, we consider a model that
includes an intercept, a slope for the first two weeks, and a different
slope for the remainder of the study. The working model is

$$
\begin{aligned}
y_{ij} = \max\{&\log_{10}(200),\ \beta_0 + \beta_1 \min(t_{ij}, 2) + \beta_2 (t_{ij} - 2) I(t_{ij} > 2) \\
&+ \gamma_1 \min(t_{ij}, 2) z_i + \gamma_2 (t_{ij} - 2) I(t_{ij} > 2) z_i + u_{ij}\},
\end{aligned}
\tag{4.1}
$$

where $y_{ij}$ is the observed $\log_{10}$VL of the $i$th subject at time
$t_{ij}$, and $z_i$ is the treatment indicator taking 1 for the treatment
group and 0 for the control.
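
The piecewise-linear structure of model (4.1) can be sketched as follows. This is only an illustration: the coefficient ordering $(\beta_0, \beta_1, \beta_2, \gamma_1, \gamma_2)$, the function names and the numerical coefficients in the usage lines are assumptions made for the example, not estimates from the paper.

```python
import math

# Left-censoring point on the log10 scale (200 copies/ml).
CENSOR_LIMIT = math.log10(200)


def design_row(t: float, z: int) -> list:
    """Covariates of model (4.1) for visit time t (weeks) and treatment z (0/1)."""
    early = min(t, 2.0)          # slope active during the first two weeks
    late = max(t - 2.0, 0.0)     # equals (t - 2) * I(t > 2)
    return [1.0, early, late, early * z, late * z]


def censored_response(t: float, z: int, coef: list, u: float = 0.0) -> float:
    """Observed log10 VL: the latent linear predictor plus error u,
    left-censored at log10(200)."""
    latent = sum(c * x for c, x in zip(coef, design_row(t, z))) + u
    return max(CENSOR_LIMIT, latent)


if __name__ == "__main__":
    hypothetical_coef = [4.0, -1.0, -0.05, 0.0, -0.1]  # made-up values
    for week in (0, 2, 4, 8, 16, 24):
        print(week, design_row(week, 1), censored_response(week, 1, hypothetical_coef))
```

In this parameterization $\gamma_1$ is the treatment difference in slope during the first two weeks and $\gamma_2$ is the difference in slope afterwards, which is why the hypotheses of interest are stated in terms of these two coefficients.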

We fit model (4.1) at different quantiles with $\tau$ varying from 0.25 to
0.9 to obtain a profile of the regimens' effects. Sun and Wu [28] analyzed
the same data using a partial linear model. However, they ignored the
left-censored observations and used only those responses within the
detectable range. We shall see that doing so leads to biased results and
distorted conclusions about the treatment's efficacy.

Table 3
The empirical coverage probabilities (ECP) and estimated mean lengths
(EML) for confidence interval procedures in Case 2 at $N = 50$. The
nominal level is 0.95

          $\tau = 0.25$      $\tau = 0.50$      $\tau = 0.75$
Method    ECP    Length      ECP    Length      ECP    Length
Omni      0.95   0.79        0.94   0.74        0.96   0.78
QRS       0.96   1.00        0.93   0.79        0.95   0.80
Boot      0.95   1.26        0.95   1.26        0.95   1.26
Naive1    0.34   0.56        0.77   0.63        0.92   0.72
Naive2    0.28   0.54        0.59   0.55        0.71   0.56

Fig. 3. Estimated quartiles of $\log_{10}$VL at six visits. The curves
with solid dots are from the censored regression and those without dots
are from the Naive2 method. The dashed, solid and dotted curves are for
$\tau = 0.25$, 0.5 and 0.75, respectively.

Figure 3 shows the estimated quartiles of the two regimens at each time
point from our method (curves with solid points), and those from Naive2,
which uses only the uncensored observations. As expected, the naive method
overestimates VL, especially in the lower quantiles. At $\tau = 0.25$, our
method shows that the VL drops rapidly during the first two weeks and
continues dropping afterward for both regimens, while Naive2 suggests an
increasing trend after week 2.

Table 4
Results of inference on $\gamma_1$ and $\gamma_2$ at several quantile levels

              Coefficient estimation                                         p-value
$\tau$  Parameter     QRS      Naive2   Hypothesis                        QRS     Naive2  Indep
0.4     $\gamma_1$    0.067    0.001    $H_0: \gamma_1 = \gamma_2 = 0$    0.022   0.965   0.100
        $\gamma_2$   -0.118    0.002    $H_0: \gamma_1 = 0$               0.126   0.930   0.066
                                        $H_0: \gamma_2 = 0$               0.001   0.841   0.015
0.5     $\gamma_1$   -0.009   -0.042    $H_0: \gamma_1 = \gamma_2 = 0$    0.001   0.717   0.001
        $\gamma_2$   -0.035    0.005    $H_0: \gamma_1 = 0$               0.900   0.507   0.872
                                        $H_0: \gamma_2 = 0$                       0.672   0.002
0.6     $\gamma_1$   -0.058    0.033    $H_0: \gamma_1 = \gamma_2 = 0$    0.192   0.844   0.067
        $\gamma_2$   -0.016   -0.009    $H_0: \gamma_1 = 0$               0.494   0.776   0.377
                                        $H_0: \gamma_2 = 0$               0.045   0.583   0.150
0.7     $\gamma_1$   -0.051   -0.047    $H_0: \gamma_1 = \gamma_2 = 0$    0.291   0.653   0.213
        $\gamma_2$   -0.009    0.002    $H_0: \gamma_1 = 0$               0.708   0.433   0.628
                                        $H_0: \gamma_2 = 0$               0.173   0.939   0.335

To analyze the treatment effect and demonstrate the importance of
accounting for censoring and dependency in the data, we tested a series of
hypotheses. Table 4 describes these hypotheses and their p-values from
QRS, Naive2 and Indep at several values of $\tau$. Note that these
significance results are for individual values of $\tau$, not from
simultaneous tests. For these data, $\delta$ is estimated to be 0.37 at
the median, corresponding to a sign correlation of
(0.37 - 0.25)/0.25 = 0.48. The following are highlights of the interesting
findings in the table.

• At $\tau = 0.4$ and 0.5, our method and the Indep method indicate that
the treatment is significantly better than the control after week 2. By
contrast, the Naive2 method indicates no significant difference.

• At $\tau = 0.6$, our method indicates that the treatment is
significantly more favorable (at the 5% level) than the control after
week 2. By contrast, neither Naive2 nor Indep indicates a significant
difference.

• For the quantile level $\tau = 0.7$, there is no significant difference
between the treatment and the control throughout the trial period, a
conclusion supported by all three methods.
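
The sign correlation of 0.48 quoted above for the median can be reproduced with the short sketch below. It assumes the standard correlation formula for two indicators $I(u_{ij} < 0)$ and $I(u_{ij'} < 0)$ that each have probability $\tau$ and joint probability $\delta$, namely $(\delta - \tau^2)/\{\tau(1 - \tau)\}$; the paper states the calculation only at $\tau = 0.5$, and the function name is hypothetical.

```python
def sign_correlation(delta: float, tau: float) -> float:
    """Correlation between the indicators I(u_ij < 0) and I(u_ij' < 0),
    given P(u < 0) = tau and P(u_ij < 0, u_ij' < 0) = delta."""
    return (delta - tau ** 2) / (tau * (1.0 - tau))


if __name__ == "__main__":
    # Estimate reported for the HIV-RNA data at the median.
    print(sign_correlation(0.37, 0.5))
    # Independent errors give delta = tau^2 and hence zero correlation.
    print(sign_correlation(0.25, 0.5))
```

Under independence $\delta = \tau^2$ and the correlation is zero, so the estimate of 0.37 against a baseline of 0.25 indicates substantial intra-subject dependency.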

Figure 4 highlights the importance of fitting a variety of quantile models
to the data. The solid line with open circles in each panel depicts the
point estimates, with the shaded area representing a 95% pointwise
confidence band following our method. The dashed line represents the mean
effect obtained from Tobin's normal censored regression model, with two
dotted lines representing a 95% pointwise confidence band for that effect.
The confidence interval for the mean effects is computed using the
bootstrap method and treating each subject as a single unit. In [28], the
p-value for testing the interaction effect was 0.0228, suggesting that the
$\log_{10}$VL of DPI drops significantly faster than that of SPI
throughout the trial period. Our method, however, suggests that the two
regimens do not differ during the first two weeks across all the quantile
levels [Figure 4(a)]. From week 2 to week 24, the mean regression method
shows that the treatment is more favorable than the control. Quantile
regression indicates that this difference mainly comes from the lower tail
of the $\log_{10}$VL distribution ($0.25 < \tau < 0.45$).

Fig. 4. The quantile estimates (open circles) of (a) $\gamma_1$ and
(b) $\gamma_2$, plotted against the quantile level. In each panel, the
shaded area depicts a 95% pointwise confidence band for the quantile
coefficient, and the dashed line represents the mean coefficient estimate,
with two dotted lines representing a 95% pointwise confidence band for the
mean effect.

5. Discussion and conclusions. In this paper we introduced inference
procedures for longitudinal data with fixed censoring within the robust
framework of a semi-parametric quantile regression model. One main focus
was on providing test procedures for comparing treatments and for
assessing the influence of a subset of the covariates. Our proposed
quantile rank score test avoids the need for estimating an unknown
density. It is relatively easy to implement and performs well in empirical
investigations. In particular, by applying our test statistics to data
from an AIDS-related clinical trial, we demonstrated the importance of
separately considering the various quantiles when assessing the relative
merits of different treatments. Moreover, our conclusions could not have
been reached by methods that ignore either censoring or intra-subject
dependency in the data.

The quantile estimate $\hat{\alpha}$ we employed in the current paper is
derived under the working assumption of independence. Efficiency might be
gained by incorporating appropriate weights to account for the
intra-subject correlation structure, as done in [17] for uncensored data.
However, He, Fu and Fung [11] and Yin and Cai [33] found, albeit in
different contexts, that the efficiency gain in doing so is minimal in
finite samples unless the intra-subject correlation is extremely high.

In Section 3, we explored, via Case 4, the robustness of our proposed
inference procedure to the assumption of homogeneity of the error terms.
Our simulations indicate that the proposed procedures, and the quantile
rank score test in particular, perform robustly. These findings are
encouraging. In a future study, we plan to generalize our methodology to
cover models with heteroscedastic errors. In the meantime, it is clear
from our proof in the Appendix that for such models our estimator for
$\alpha_0$ is still strongly consistent and asymptotically normal, but
with a modified asymptotic variance-covariance matrix. Without an
appropriate weighting, however, the limiting distribution of our rank
score test is no longer a chi-squared distribution.

In this paper, the focus was on an exchangeable correlation structure
where $\delta = P(u_{ij} < 0, u_{ij'} < 0)$ is taken to be the same for
all pairs of errors. Indeed, one can apply the methodology developed here
to situations with more general intra-subject dependency structures, as
long as one can obtain a consistent estimator for the variance-covariance
matrix $\mathrm{Cov}\{\psi_\tau(U)\}$.



APPENDIX 

A.1. Proof of Theorem 2.1. We first show the strong consistency of
$\hat{\alpha}_0$ by modifying the proof of Theorem 1 in [24] to cover
censored regression models for longitudinal data. By the continuity of
$Q_n(\alpha)$ defined in (2.6), it suffices to show the strong consistency
of $\hat{\alpha}_0$ under $H_0$.

Define

$$
u_{ij}^* = y_{ij} - \max(0, x_{ij}^T\alpha_0)
         = \max(0, x_{ij}^T\alpha_0 + u_{ij}) - \max(0, x_{ij}^T\alpha_0)
$$

and

$$
h_{ij}(\alpha, \alpha_0) = \max(0, x_{ij}^T\alpha) - \max(0, x_{ij}^T\alpha_0).
$$

Note that the minimization of $Q_n(\alpha)$ is equivalent to the
minimization of

$$
\begin{aligned}
Q_n^*(\alpha)
&= n^{-1}\sum_{i=1}^{N}\sum_{j=1}^{n_i} \rho_\tau\{y_{ij} - \max(0, x_{ij}^T\alpha)\}
 - n^{-1}\sum_{i=1}^{N}\sum_{j=1}^{n_i} \rho_\tau\{y_{ij} - \max(0, x_{ij}^T\alpha_0)\} \\
&= n^{-1}\sum_{i=1}^{N}\sum_{j=1}^{n_i} \rho_\tau(u_{ij}^* - h_{ij})
 - n^{-1}\sum_{i=1}^{N}\sum_{j=1}^{n_i} \rho_\tau(u_{ij}^*).
\end{aligned}
$$

Then each $R_{ij} = \rho_\tau(u_{ij}^* - h_{ij}) - \rho_\tau(u_{ij}^*)$ is
bounded by $\|x_{ij}\|(\|\alpha\| + \|\alpha_0\|) = O(\|x_{ij}\|)$. By
condition A2, Levy's theorem [26] and Lemma 2.2 of [31], $\hat{\alpha}_0$
will be strongly consistent if the conditional expectation of
$Q_n^*(\alpha)$ given the covariates,

$$
E[Q_n^*(\alpha) \mid \{x_{ij}\}] \equiv \bar{Q}_n(\alpha), \tag{A.1}
$$

is strictly positive for $\|\alpha - \alpha_0\| \ge \varepsilon$ for
arbitrary $\varepsilon > 0$ and all $n$ sufficiently large.

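From the derivation of the conditional expectation in (A.1), it can be
shown that

$$
\begin{aligned}
\bar{Q}_n(\alpha) \ge n^{-1}\sum_{i,j} \Big[ & I(x_{ij}^T\alpha_0 > 0,\ x_{ij}^T\alpha > 0)
   \int_0^{x_{ij}^T\Delta} (x_{ij}^T\Delta - \lambda) f(\lambda)\, d\lambda \\
 & + I(x_{ij}^T\alpha_0 > 0,\ x_{ij}^T\alpha \le 0)
   \int_{-x_{ij}^T\alpha_0}^{0} (\lambda + x_{ij}^T\alpha_0) f(\lambda)\, d\lambda \Big],
\end{aligned}
\tag{A.2}
$$

where $\Delta = \alpha - \alpha_0$.

Following the same argument as in (A.10) of Powell [24], condition A4
and inequality (A.2) yield

$$
\bar{Q}_n(\alpha) \ge \tfrac{1}{2}\varepsilon_1 c^2\, n^{-1}\sum_{i,j}
   I(x_{ij}^T\alpha_0 > \varepsilon_0)\, I(|x_{ij}^T\Delta| > c) \tag{A.3}
$$

for any positive number $c < \min(\varepsilon_0, \varepsilon_1)$, where
$\varepsilon_0$ is defined in condition A3. This completes the proof of
strong consistency because the right-hand side of (A.3) is strictly
positive by conditions A2 and A3.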
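
For readers who wish to evaluate the objective used in this proof numerically, the sketch below codes the check function $\rho_\tau(r) = r\{\tau - I(r < 0)\}$ and the averaged criterion $n^{-1}\sum_{i,j}\rho_\tau\{y_{ij} - \max(0, x_{ij}^T\alpha)\}$. It assumes responses censored at zero, as in the appendix, with the observations of all subjects stacked into one list; the function names are illustrative, and no minimization routine is included.

```python
from typing import Sequence


def check_loss(r: float, tau: float) -> float:
    """Quantile check function rho_tau(r) = r * (tau - I(r < 0))."""
    return r * (tau - (1.0 if r < 0 else 0.0))


def censored_objective(alpha: Sequence[float],
                       x_rows: Sequence[Sequence[float]],
                       y: Sequence[float],
                       tau: float) -> float:
    """Average check loss of y - max(0, x'alpha) over all stacked observations."""
    total = 0.0
    for x, yi in zip(x_rows, y):
        fitted = max(0.0, sum(a * xi for a, xi in zip(alpha, x)))
        total += check_loss(yi - fitted, tau)
    return total / len(y)


if __name__ == "__main__":
    x_rows = [[1.0, 1.0], [1.0, -3.0]]
    y = [2.0, 0.0]  # the second response is censored at zero
    print(censored_objective([1.0, 1.0], x_rows, y, tau=0.5))
    print(censored_objective([0.0, 0.0], x_rows, y, tau=0.5))
```

Because of the inner maximum the criterion is not convex in $\alpha$, which is one reason the consistency argument works with the expected criterion rather than with first-order conditions alone.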

The asymptotic normality follows from the application of Liapunov's
central limit theorem to the following lemma:

Lemma A.1. Under (2.5) and conditions A1-A6, we have

$$
n^{1/2}(\hat{\alpha}_0 - \alpha_0)
 = n^{-1/2}\{f(0)\}^{-1} D_1^{-1} \sum_{i,j} I(x_{ij}^T\alpha_0 > 0)\, x_{ij}\, \psi_\tau(u_{ij})
 + D_1^{-1} D_2 \beta_0 + o_p(1). \tag{A.4}
$$

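Proof. Define

$$
\psi_i(\alpha) = \sum_{j=1}^{n_i} I(x_{ij}^T\alpha > 0)\, x_{ij}\,
   \psi_\tau\{u_{ij} + n^{-1/2} z_{ij}^T\beta_0 - x_{ij}^T(\alpha - \alpha_0)\},
$$

$$
\lambda_n(\alpha) = \sum_{i=1}^{N} E\{\psi_i(\alpha)\}, \qquad
U_i(\alpha, d) = \sup_{\|\gamma - \alpha\| \le d} \|\psi_i(\gamma) - \psi_i(\alpha)\|.
$$

Note that the $\psi_i(\alpha)$ are independent and

$$
\psi_i(\hat{\alpha}_0) = \sum_{j=1}^{n_i} I(x_{ij}^T\hat{\alpha}_0 > 0)\, x_{ij}\,
   \psi_\tau(y_{ij} - x_{ij}^T\hat{\alpha}_0). \tag{A.5}
$$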

Since $\hat{\alpha}_0$ minimizes the objective function $Q_n(\alpha)$ in
(2.6) over all $\alpha$, the directional derivative of $Q_n(\alpha)$ at
$\hat{\alpha}_0$ in the direction of a unit vector $v$ is nonnegative.
That is, $\lim_{h \downarrow 0}\{Q_n(\hat{\alpha}_0 + hv) - Q_n(\hat{\alpha}_0)\}/h \ge 0$.
This implies that

<- I{xJj^>^)iPr{yij-xl-w)N^Xiy 
yij=xJjao 

However, the right-hand side is bounded by 

(A.6) 11^^.11- 

yij=x'[jao 

By conditions A4 and A5 and the strong consistency of oq, we have al- 
most surely a finite number of observations with zero residuals and therefore 
the quantity in (A.6) is equal to o(n^/^7„) by condition A2, where 7^ is a 
sequence of positive numbers going to infinity. 

Thus, we have 

(A.7) J2M^o) = o{n'/^jn). 

i=l 

Since (A.7) is the same as (2.1) in [12], the Bahadur representation in
(A.4) will follow from their Theorem 2.2 if conditions B1-B4, B5' and B8
in that theorem hold.

B1. The measurability is trivially satisfied.

B2. This follows directly from the strong consistency of $\hat{\alpha}_0$.

B3. With some manipulations, we obtain 

Ui{a,d)< sup TY\\xij\\I{\xJja\ <\xjj{a- ■y)\} 

I7— o||<c! j 

+ sup Y\\xij\\I{\uij - xfja + n~'^/^z'[jPo\ <\xfj{a--f)\} 

\\'y—ct\\<d j 

<TCii + Bi, 
where Bi = J2j \\xij\\I{\uij - xfja + n~^/'^z'[jf3o\ < \\xij\\d) and Cn = J2j 

I{\xfja\ < \\xij\\d). Denote Cj2 = J2j W^ijll'^^il^Tj'^l ^ ll^^iiMI- Then, by con- 
ditions A2 and A4, we have 

E{Bi) = Y,\\xij\\P{\u^J-xla + n~^/^zJJPo\ < \\xij\\d} 
j 

= E + 0{d^) = Wad + 0{d^), 

j 

j 

where Wn and Wi2 are some positive constants. Therefore, condition B3 
fohows with tti = d~^T'^niCi2 + 2TCiiWii + Wi2. 

B4. Under condition A2, we have = J2i = 0{n). Thus = 0{An). 
B5'. By condition A2, maxjUj < njmaxjj \\xij\\ = 0(n^/^). B5' follows by 
taking dn = n~^/^(logn)^. 

B8. Note that 

An (a) = ^I{xfja > 0)xij[T - F{xfj{a - qq) - n'^/"^ zf^fio}] 

(A.8) 

= -n/(0)Di„(a)(a - qq) + n^''' f{^)D2n{a)[3o + 0(1). 
Therefore we have 

(A.9) A„(ao) = -n/(0)Di„(do)(ao - ao) + 0(1), 

where ao = ao + "'~^''^{-Din(ao)}~^-C)2n(ao)/?o- Thus B8 holds with 6„ = 
0(1) and Dn = —nf{0)Din{ao). It then follows from the consistency of ao 
and Theorem 2.2 of [12] that 

n'/\ao - do) = n-'^''{ifiO)y'{Diniao)}~'Y.M^o) + Op(l) 

i 

(A.IO) =n-i/2{/(o)}-i{I)i„(ao)}-i5^/(x^ao>0)x,j 

X ipr{uij + n'^/'^zfjPo - xJj{do - ao)} + Op(l). 
By expanding ipT-{u + e) around u, we obtain 

n^/2(ao - ao) 

= n-^D^^ J2 ^(4«o > 0)x,, (4/?o - xJjDY^D2Po) 

(A.ll) 

= n-i/2{/(o)}-ii^-i ^/(x^.ao > 0)x,,ipr{uij) + Op(l). 



This completes the proof of Lemma A.1. □

A.2. Proof of Theorem 2.2. The proof of Theorem 2.2 relies on the
following two lemmas:

Lemma A. 2. Define 

where u*- = yij — max(0, xj-a^ + n~^/'^zfjPo)- Then under Hn and conditions 
A1-A7, we have 5^ = 5^ + n"V(0) Tij lixfjao > 0)z*jzfjPo + Op(l). 

Proof. First note that 



(A.12) 



X {max(0, Uij + xfjao + n ^^'^zfjPo) 
— max(0, xfjUo + n~^^'^zfjf3o)} 
= ^ z*jl{xfja + n-^/^zfjPo > 0)ipriu^j). 

ij 

For any fixed t such that ||t|| < C, we define 

rii 

Ri{t) = z*j[I{xfjao + n~^/'^xfjt > 0) 

X '^riVij — max(0,x^ao + n~^^'^xjjt)} 

- I{xJjOiQ + n~'^/'^zJj/3o > 0)(Pr{uij)] 

rii 

= E z*j{I{xfjao + n-^'^xlt > 0) 
i=i 

X ipriuij + n~^/'^zJj(3o - n~^/'^xjjt) 

- I{xJjao + n'^^'^zfjPo > 0)ipr{uij)}. 
By condition A4, each coordinate R\''\t), k = 1, . . . ,q, satisfies 
EVar{i?f)(t)} 

i 

i j 

^ E^»E Il4ll^{^2|n^i/2x^.t - n~^/^zfjpo\+I{\ xfjao |< djxi\\)}, 




where (i„ = n~-'-/^||xij||""'"max(||3;jj||, ||zij||) • max(||/3o||, ||t||) = 0{n~^^^). It 
then follows from condition A5 that 

(A.13) ^ Var{i?f = 0{n^/^). 

i 

Under condition A2, it is clear that maxj \\Ri{t)\\ <Cn^/^ for some constant 
C . It follows from the Hoeffding inequality and the chaining argument that 



(A. 14) sup 

\\t\\<c 



0.p{n'/\\ognf/^). 



Under conditions A2 and A4 and using the orthogonality of Z* and {I{XaQ > 
0) (>i)l^}X, we obtain that 



E E 4{^(4«o + n-^'xlt > 0)(r - F(n-V2^T, _ n" V^^^^o))} 
« j 

'/'/(O) E 4^(4«o > 0)4/3o + 0(1). 



n 



This, together with Theorem 2.1 and (A. 14), completes the proof. □ 

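Lemma A.3. Under $H_n$ and conditions A1-A7, we have
$\hat{\delta} \xrightarrow{p} \delta$ as $n \to \infty$.

Proof. Recall that
$\hat{\delta} = L^{-1}\sum_{i}\sum_{j \ne j'} I(x_{ij}^T\hat{\alpha}_0 > 0,\ x_{ij'}^T\hat{\alpha}_0 > 0)\, I(\hat{u}_{ij} < 0,\ \hat{u}_{ij'} < 0)$.
We define
$\delta^* = L^{-1}\sum_{i}\sum_{j \ne j'} I(x_{ij}^T\alpha_0 + n^{-1/2}z_{ij}^T\beta_0 > 0,\ x_{ij'}^T\alpha_0 + n^{-1/2}z_{ij'}^T\beta_0 > 0)\, I(u_{ij}^* < 0,\ u_{ij'}^* < 0)$,
where $u_{ij}^* = y_{ij} - \max(0, x_{ij}^T\alpha_0 + n^{-1/2}z_{ij}^T\beta_0)$.
Using Lemma 4.1 of [12] and the root-$n$ consistency of $\hat{\alpha}_0$,
we can establish that $\hat{\delta} - \delta^* = o_p(1)$. Lemma A.3 thus
follows by applying the weak law of large numbers. □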

Proof of Theorem 2.2. Denote Ri = J2j I{xfjao + n~'^^'^zfjPo)z*jipr{uij). 
Note that S* = J2i is the summation of independent entries. It fol- 
lows from the Lindberg-Feller central limit theorem (CLT) that 

(A.15) {K(<5;ao)}-^/'5: ^ Af(0„/g). 

The proof of Theorem 2.2 is therefore complete by combining (A.15) with
Lemmas A.2 and A.3. □